Sometimes I work on math stuff and need something like LaTeX to make laying things out clearer. This is a page where I'll write all that stuff out. Feel free to poke around with what I'm doing or leave notes.
1. L1 Pseudoinverse Thing
This is a temporary page I've made so that I can mess around with this L1-based version of the pseudoinverse I've been screwing around with. Don't mind it for now; I'll delete it and put something nicer up when I'm done.
Status: I now know enough vector calculus to solve this problem, but I ran into a theoretical snag: cheating and saying d|x|/dx = sign(x) doesn't work. But cangwu badness is more important for now anyway.
WTF are you doing Mike
Back in the day, we had Tenney height and TOP. Then, Graham figured out that the pseudoinverse makes the L2-based versions of these things a million times easier to compute. {I think that was me -- Gene} This is because the pseudoinverse makes the following problem easy:
If M is an mxn matrix that maps from R^n to R^m, then for some vector w in R^m, find the "shortest" vector v in R^n mapping to it.
Pretty much the entire appeal of the pseudoinverse is that it makes this problem extremely easy to solve if our notion of "shortest vector" happens to be built around the L2 norm: when solving Mv = w for the shortest vector v, just set v = pinv(M)*w. Since many tuning-related problems can be reduced to a situation like this, the pseudoinverse is a useful thing to have.
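Here's a minimal numpy sketch of that recipe, with made-up numbers (nothing tuning-specific about them):

    import numpy as np

    # A made-up 2x3 mapping matrix of full row rank and a target vector w.
    M = np.array([[1.0, 1.0, 0.0],
                  [0.0, 1.0, 2.0]])
    w = np.array([3.0, 5.0])

    # Minimum-L2-norm solution of M v = w, straight from the pseudoinverse.
    v = np.linalg.pinv(M) @ w
    print(M @ v)                      # recovers w (up to rounding)

    # Any other solution v + t*n, with n spanning the null space, is longer.
    n = np.array([2.0, -2.0, 1.0])    # M @ n == 0
    print(np.linalg.norm(v) <= np.linalg.norm(v + 0.7 * n))   # True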
The reason the pseudoinverse is so magical is that for any mapping matrix M between real vector spaces V -> W, the set of all such L2-"shortest vectors" in V happens to form an actual subspace of V. So the function f: W -> V sending each w to the shortest v mapping to it is actually a linear map, which can be represented by a matrix multiplication. The pseudoinverse is guaranteed to give you that matrix.
For the L1 norm, unfortunately, the L1-shortest vectors don't form a subspace in general, so there's no simple matrix that can stand in for the pseudoinverse: the function mapping from W to V that we want is nonlinear. However, just because it's nonlinear doesn't necessarily mean that it's all that complicated, and in the case of L1 it may not be. We'll find out.
OK show me the money
Let's assume M is our mapping matrix from V -> W, w is our target vector, and v is any vector in V mapping to w under M. Then the Lp norm of v is represented by

||v||_p = (sum_i |v_i|^p)^(1/p)

where v_i are the different coordinates of v.

For now, let's assume that M is of full row rank, and that it's of codimension 1. Then let n be any vector spanning the 1D null space of M, so that every vector mapping to w has the form v + t*n. Then we want to find this:

minimize over t:  ||v + t*n||_p

We can start by differentiating with respect to t and searching for when the derivative is 0

d/dt (sum_i |v_i + t*n_i|^p)^(1/p) = 0

which works out to

(1/p) * (sum_i |v_i + t*n_i|^p)^(1/p - 1) * sum_i p * |v_i + t*n_i|^(p-1) * sign(v_i + t*n_i) * n_i = 0

You can take the p out of the right summation and it cancels the 1/p on the left, so you get

(sum_i |v_i + t*n_i|^p)^(1/p - 1) * sum_i |v_i + t*n_i|^(p-1) * sign(v_i + t*n_i) * n_i = 0

If p=1, both exponents become 0 and a ton of stuff cancels out, so you get

sum_i sign(v_i + t*n_i) * n_i = 0
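Since ||v + t*n||_1 is convex and piecewise linear in t, with kinks exactly where some v_i + t*n_i = 0, the minimum always lands on one of those breakpoints, which sidesteps the d|x|/dx = sign(x) issue. Here's a small numerical sketch of that (the breakpoint trick is my own workaround, reusing the made-up M, w, n from the earlier sketch):

    import numpy as np

    def shortest_l1_solution(v, n):
        # v is any solution of M v = w; n spans the 1D null space of M.
        # ||v + t*n||_1 is convex and piecewise linear in t, so its minimum
        # occurs at a breakpoint t = -v_i/n_i (this is where
        # sum_i sign(v_i + t*n_i)*n_i changes sign, matching the p=1 condition above).
        ts = [-vi / ni for vi, ni in zip(v, n) if ni != 0]
        best_t = min(ts, key=lambda t: np.abs(v + t * n).sum())
        return v + best_t * n

    M = np.array([[1.0, 1.0, 0.0],
                  [0.0, 1.0, 2.0]])
    w = np.array([3.0, 5.0])
    n = np.array([2.0, -2.0, 1.0])          # spans null(M)

    v2 = np.linalg.pinv(M) @ w              # L2-shortest solution
    v1 = shortest_l1_solution(v2, n)        # L1-shortest solution
    print(np.abs(v1).sum() <= np.abs(v2).sum())   # True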
More to come
2. Cangwu badness and exterior algebra
Cangwu badness was invented by Graham but the matrix algebra involved to understand it is complicated. It would be nice to simplify it to exterior algebra terms.
I conjecture that cangwu(W) is proportional to ||W^(ß+J)||, where W is a multival, J is the JIP and ß is a scalar (grade-0 multivector), so ß+J is a mixed-grade multivector. (EDIT: I think this is wrong now, but ||W^(ß+J)|| might be just as good and much simpler)
(This differs from logflat badness, which is ||W^J||*||W||^k for some magic k which makes the whole thing "logflat.")
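Just to spell out what the conjectured quantity is: ß is grade 0 and W^J is one grade higher than W, so W^(ß+J) = ßW + W^J, and since those two pieces share no grades their coefficients don't overlap, giving ||W^(ß+J)||^2 = ß^2*||W||^2 + ||W^J||^2. In other words it's built from the same two ingredients as logflat badness, ||W|| and ||W^J||, just combined in quadrature rather than multiplicatively.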
In what follows, monzos are column vectors and vals are row vectors. Also, if I divide by a 1x1 matrix, just treat it like I'm dividing by a scalar; the determinant is assumed.
Links
Primerr - http://x31eq.com/primerr.pdf
Cangwu Badness - http://x31eq.com/badness.pdf
Terms
M = weighted mapping matrix, rows > cols
W = weighted multival obtained by taking the exterior product of the rows of M
r = rank of M
J = weighted JIP
Ek = the cangwu badness Ek parameter
|| · || = L2 norm. For a mixed-grade element of the exterior algebra, still just take the L2 norm of all the coefficients as though it were a single multivector.
A^B = wedge product of vectors A and B
A·B = dot product of vectors A and B, also used for matrix multiplication, context will always make this clear
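For concreteness, here's roughly how I'd set those terms up numerically (my own sketch; I'm assuming the usual Tenney weighting from primerr.pdf, where each column is divided by log2 of its prime so the weighted JIP becomes all ones, and I'm writing vals as the rows of M):

    import numpy as np

    primes = np.array([2.0, 3.0, 5.0])                 # 5-limit, just as an example
    M_unweighted = np.array([[1, 0, -4],
                             [0, 1,  4]], dtype=float)  # meantone mapping, rows are vals

    M = M_unweighted / np.log2(primes)                 # weighted mapping matrix
    J = np.ones(len(primes))                           # weighted JIP: each entry log2(p)/log2(p) = 1
    r = np.linalg.matrix_rank(M)                       # rank of M, here 2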
Graham's Definitions (but using my variable names)
Cangwu Badness
(badness.pdf, eq. 3)
"Scalar" Badness (TE badness with some extra factors of ||J|| in there)
(primerr.pdf, eq. 80 (above in matrix algebra form), same as Cangwu badness with Ek=0)
"Scalar" Complexity (TE complexity with some extra factors of ||J|| in there)
(primerr.pdf, eq. 76, note the expansion between eq. 76 and eq. 77)
TOP-RMS Error (TE error with some extra factors of ||J|| in there)
(primerr.pdf, page 29, leading to equation 111)
Note: Scalar badness = Scalar complexity * TOP-RMS error
The Big List of Identities
Fundamental stuff
If A, B are mixed-grade multivectors with no grades in common, then ||A + B||^2 = ||A||^2 + ||B||^2
Linear algebra notes so I don't forget
det(AB) = det(A)det(B)
det(A+B) = det([A;I]·[I;B]) = det([A;I])·det([I;B])
Remember C^n(M) = Λ^n(M), where C^n is compound matrix and Λ^n is exterior power
Other ways to write Graham's matrix algebra definitions
Cangwu Badness
Note that Scalar Badness is the same as the above, just with Ek=0. The most elegant one is
Scalar Complexity
TOP-RMS Error
Stuff relating wedgies and mapping matrices
Norm of wedgie identity
The above, but for some scalar k
||W^J|| translated into matrix form
That third step uses the fourth identity here: http://en.wikipedia.org/wiki/Determinant#Block_matrices.
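My reading of the norm of wedgie identity is the Cauchy-Binet statement ||W||^2 = det(M·M'), and likewise ||W^J||^2 is det of the Gram matrix you get by stacking J onto M as an extra row. Here's a quick numerical check of that reading (meantone example, my own code):

    import numpy as np
    from itertools import combinations

    def wedgie(A):
        # Coefficients of the exterior product of the rows of A:
        # all r x r minors, i.e. the single row of the r-th compound matrix C^r(A).
        r, ncols = A.shape
        return np.array([np.linalg.det(A[:, list(c)]) for c in combinations(range(ncols), r)])

    primes = np.array([2.0, 3.0, 5.0])
    M = np.array([[1, 0, -4],
                  [0, 1,  4]], dtype=float) / np.log2(primes)   # weighted meantone mapping
    J = np.ones(3)                                              # weighted JIP

    W = wedgie(M)
    print(np.isclose(W @ W, np.linalg.det(M @ M.T)))            # ||W||^2 = det(M M'), True

    MJ = np.vstack([M, J])                                      # J stacked on as an extra row
    WJ = wedgie(MJ)
    print(np.isclose(WJ @ WJ, np.linalg.det(MJ @ MJ.T)))        # ||W^J||^2, True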
Exterior algebra versions of Graham's definitions
Scalar Complexity
(easy to prove from "fundamental stuff" identities, generalization of what Graham proved in primerr.pdf eq. 98)
TOP-RMS Error
(Graham asserts this in eq. 39 in badness.pdf, no proof given but seems to check out)
Scalar Badness
(multiply the exterior algebra versions of scalar complexity and TOP-RMS error together)
Attempt to prove that my conjecture is wrong
Now we see that we have something which is close to, but not quite proportional to this way of writing Cangwu badness:
We don't really care about the denominator here, since we only care about proportionality, so we'll get rid of it. Also, since both expressions above sit under a square root, we can ditch those as well. Then we're left with the question of whether, for some p, the following is true:
To clean this up a bit, we'll rewrite what's below as follows:
This isn't true in general, and I doubt that there's some special circumstance which makes it true in this case. However, this doesn't mean all is lost: Theorem 3.2 of this paper (http://rspa.royalsocietypublishing.org/content/459/2030/273.full.pdf) says how to turn det(A+B) into some exterior-algebraic expression, and we can start from there.
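For reference, the bare-hands version of that expansion, which is just multilinearity of the determinant in its columns (I haven't checked how closely it matches the paper's Theorem 3.2): det(A+B) is the sum, over every subset S of the column indices, of det(C_S), where C_S takes its columns from A at the indices in S and from B everywhere else. Each of those mixed determinants is a wedge product of some columns of A with some columns of B, which is where the exterior algebra comes in.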
A second approach
Let's take a look at that B term and play a bit, keeping in mind that J·J' is a constant scalar term:
Let's now go back to this:
Substituting the results of the above in, we get
You'll note that both of these determinants involve an rxr matrix, so we can factor the J·J' out (using det(cX) = (c^r)*det(X) for an rxr matrix X) and cancel it on both sides:
Let's play with the right side of this equation real quick: